Claims

What is claimed is:

1. A method for handling errors in adapters used for communication in a data processing network
having at least two nodes connected through a switch, said error handling method comprising the
steps of:

detecting a nonpermanent error condition, within an adapter connected to one of said
nodes, from which recovery is possible from within the node connected to said adapter;

suspending communications from within the node with the adapter affected by said error 
condition; 

disabling communication between said affected adapter and said switch so as to provide 
an indication to at least one other node in said network that communication with said affected 
adapter is at least temporarily suspended so as to effectively cause suspension of, but not 
termination of, applications running on said at least one other node in said network; 

performing recovery operations, at said affected node, to restore operation of said affected 
adapter, based on said detected error condition, said recovery including enablement of said 
disabled communication; and

resuming communication with said affected adapter upon enablement of said disabled 
communication. 
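
For illustration only, the ordering of the steps recited in claim 1 can be sketched as follows. This is a minimal toy model, not the claimed implementation: the `Adapter` class, its fields, the `handle_recoverable_error` function, and the `recover` callback are all hypothetical names introduced here, and the error code is an arbitrary placeholder.

```python
class Adapter:
    """Toy model of an adapter attached to one node (all names hypothetical)."""

    def __init__(self):
        self.local_comm_active = True  # node-side communication through the adapter
        self.link_enabled = True       # adapter-to-switch communication
        self.steps = []                # record of the steps taken, in order


def handle_recoverable_error(adapter, error_code, recover):
    """Walk through the claimed steps for a nonpermanent error.

    `recover` is an assumed callback standing in for the node-local
    recovery operations selected according to the detected error.
    """
    # Step 1: a nonpermanent (recoverable) error has been detected.
    adapter.steps.append(f"detected:{error_code}")

    # Step 2: suspend communication from within the affected node.
    adapter.local_comm_active = False
    adapter.steps.append("suspended")

    # Step 3: disable adapter-to-switch communication; other nodes observe
    # this and suspend, rather than terminate, their applications.
    adapter.link_enabled = False
    adapter.steps.append("link_disabled")

    # Step 4: perform recovery, which includes re-enabling the link.
    recover(adapter, error_code)
    adapter.link_enabled = True
    adapter.steps.append("link_enabled")

    # Step 5: resume communication once the link is enabled again.
    adapter.local_comm_active = True
    adapter.steps.append("resumed")
    return adapter.steps
```

In this sketch, recovery runs while both node-side communication and the switch link are disabled, and communication resumes only after the link has been re-enabled.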

2. A method for handling adapter errors in a multinode data processing network in which
node-to-node communication is at least partially handled by adapters connected to said nodes, 
said adapters operating to pass messages from said nodes through a switch which links the nodes 
in said network, said error handling method comprising the steps of: 

detecting a nonpermanent error condition, within an adapter connected to one of said 
nodes, from which recovery is possible from within the node connected to said error affected 
adapter; 

suspending communication from the node connected to said affected adapter; 

disabling communication between said affected adapter and said switch so as to provide 
an indication to at least one other node in said network that communication with said affected 
adapter is at least temporarily suspended, so as to effectively cause suspension of, but not 
termination of, applications running on said at least one other node in said network;

performing recovery operations, at said affected node, to restore operation of said affected 
adapter, based on said detected error condition, said recovery including enablement of said
disabled communication; 

terminating said running applications on nonaffected nodes in said network upon a 
determination that reestablishment of communication with said affected adapter is taking too 
long; and

otherwise maintaining said running applications and restoring communication with said
affected node after performance of said recovery operations. 
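
As a hedged illustration of the final two steps of claim 2, the choice between terminating and maintaining applications on a nonaffected node can be sketched as a timeout comparison. The claim does not specify how "too long" is determined; the `timeout` parameter, the timestamp arguments, and the function name below are assumptions made for this sketch.

```python
def remote_node_decision(link_down_at, link_up_at, timeout):
    """Decide the fate of suspended applications on a nonaffected node.

    link_down_at: time at which communication with the affected adapter
                  was observed to be disabled.
    link_up_at:   time at which communication was restored, or None if it
                  has not been restored.
    timeout:      assumed limit on how long reestablishment may take.

    Returns "maintain" if communication came back within the limit,
    otherwise "terminate".
    """
    if link_up_at is not None and (link_up_at - link_down_at) <= timeout:
        # Recovery finished in time: keep the applications and resume.
        return "maintain"
    # Reestablishment is taking too long (or never happened): terminate.
    return "terminate"
```

For example, with an assumed limit of 5 time units, a link restored 3 units after it went down leads to the applications being maintained, while a link restored after 8 units, or never restored, leads to termination.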

3. The method of claim 2 in which at least one of said applications is running in a window 
environment. 

4. The method of claim 2 in which said suspending step includes fencing at least one 
communication port via which said adapter is connected to said switch. 

5. The method of claim 2 in which said suspending step further includes halting direct memory 
access between said affected adapter and the node to which it is connected. 

6. The method of claim 2 in which said recovery operations include logging operations which are
carried out for said adapter to facilitate error analysis.